Perfect Sampling with Unitary Tensor Networks 
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Tensor network states are powerful variational ansatze for many-body ground states of quantum lattice mod- 
els. The use of Monte Carlo sampling techniques in tensor network approaches significantly reduces the cost 
of tensor contractions, potentially leading to a substantial increase in computational efficiency. Previous pro- 
posals are based on a Markov chain Monte Carlo scheme generated by locally updating configurations and, as 
such, must deal with equilibration and autocorrelation times, which result in a reduction of efficiency. Here we 
propose perfect sampling schemes, with vanishing equilibration and autocorrelation times, for unitary tensor 
networks - namely tensor networks based on efficiently contractible, unitary quantum circuits, such as unitary 
versions of the matrix product state (MPS) and tree tensor network (TTN), and the multi-scale entanglement 
renormalization ansatz (MERA). Configurations are directly sampled according to their probabilities in the 
wave-function, without resorting to a Markov chain process. We consider both complete sampling, involving 
all the relevant sites of the system, as well as incomplete sampling, which only involves a subset of those sites, 
and which can result in a dramatic (basis-dependent) reduction of sampling error. 

PACS numbers: 05.10.-a, 02.50.Ng, 03.67 .-a, 74.40.Kb 



I. INTRODUCTION 

To the computational physicist interested in one- 
dimensional quantum lattice models, the density matrix 
renormalization group (DMRG)i2 is a dream come true. 
It provides an essentially unbiased, extremely accurate 
variational approach to ground state properties of a large 
class of local Hamiltonians in one dimensional lattices. 
DMRG operates by approximating the ground state of the 
system with a matrix product state (MPS)2r— , which is a 
simple tensor network with tensors connected according to 
a one-dimensional array. In recent years, the success and 
broad applicability of DMRG has been understood to follow 
from (i) the existence of a characteristic, universal pattern of 
entanglement common to most ground states in one spatial 
dimension; and (ii) the ability of the MPS to reproduce this 
universal pattern of entanglement, thanks to having its tensors 
connected into a one-dimensional geometry. 

The above insight has since then guided the development 
of new tensor network approaches that aim to repeat, in other 
geometries or physical regimes of interest, the unprecedented 
success of DMR G 1 ' 2 ' 7 ' 8 in one dimension. The recipe is quite 
simple: first, identify a pattern of entanglement common to 
a large class of ground states; then, connect tensors so that 
they can reproduce this pattern, and use the resulting tensor 
network as a variational ansatz. In this way the multi-scale, 
layered pattern of entanglement observed in ground states 
near a continuous quantum phase transition motivated the pro- 
posal of the multi-scale entanglement renormalization ansatz 
(MERA)2i!£ to address quantum critical phenomena. Simi- 
larly, the characteristic spatial pattern of entanglement in the 
ground states in two and higher dimensions motivated higher- 
dimensional generalizations of both the MPS (known as pro- 
jected entangled-pair states, PEPSii— ) and the MERA-H— . 

The cost of simulating a lattice of L sites with any of the 
above tensor networks is roughly proportional to L, which un- 



derlies the efficiency of the approaches 2 ^. Importantly, how- 
ever, this cost also grows as 0(x p ), that is as a power p of the 
dimension \ of the indices connecting the tensors into a net- 
work. On the one hand, this bond dimension \ determines the 
size of the tensors and therefore the number of variational pa- 
rameters contained in the tensor network ansatz. On the other, 
X is also a measure of how much entanglement the tensor net- 
work can carry. It then follows that the cost of simulations 
increases with the amount of entanglement in the ground state 
of the system. Entanglement is indeed the key factor limiting 
the range of applicability of tensor network approaches. 

More specifically, for an MPS, a small power p, namely 
P mps = 3, implies that very large values of \ (°f U P to a 
few thousands) can be considered even with a high-end desk- 
top computer. Correspondingly, DMRG can address one- 
dimensional systems with robustly entangled ground states. In 
contrast, the cost of two dimensional simulations with PEPS 
and MERA scales with a much larger power p of x< e -g- 
p p E p S = 12 in Ref. [l4|and p MERA = 16 in Ref. [2ll and this con- 
siderably reduces the affordable values of x- In other words, 
PEPS and MERA calculations have so far been restricted to 
systems with relatively small amounts of ground state entan- 
glement. A major present challenge for these approaches is 
to obtain more efficient tensor contraction schemes that could 
lower their cost. 

A possible route to reducing the scaling of computational 
cost with x m tensor network algorithms is by u sing Monte 
Carlo sampling techniques, as proposed in Refs. I23l - l25l As 
reviewed in the next section, the cost of manipulating the ten- 
sor network (for a single sample) is reduced to 0(x q ), where 
q is significantly smaller than p (typically of the order of p/2). 
The proposals in Refs. [23tf24] are best suited for tensor net- 
works, such as MPS and PEPS, where the coefficients in the 
tensors are unconstrained. However, in the MERA, as well as 
in other unitary tensor networks such as unitary versions of 
MPS (uMPS) and of tree tensor network 2 ^ (uTTN), tensors 
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are subject to unitary constraints. 



II. BACKGROUND MATERIAL: SAMPLING IN TENSOR 
NETWORK ALGORITHMS 



The purpose of this paper is to address the use of Monte 
Carlo sampling in the context of unitary tensor networks, in- 
cluding uMPS, uTTN and MERA. [Notice that this excludes 
tensor networks such as a periodic MPS or PEPS, which can- 
not be generically re-expressed as a unitary tensor network]. 
An important difference with respect to Refs. [231241 is that 
in a unitary tensor network, sampling is performed on an ef- 
fective lattice corresponding to the past causal cone of the lo- 
cal operator whose expectation value is being computed. This 
means that sampling typically occurs over some reduced num- 
ber of sites (less than the system size L). A second difference 
is that in unitary tensor networks there is no need to use a 
Markov chain Monte Carlo scheme. Indeed, our main result 
is the proposal and benchmark of perfect sampling schemes 
for unitary tensor networks, by means of which one can ob- 
tain completely uncorrected samples directly according to the 
correct probability. Therefore, one can sample without incur- 
ring additional computational costs due to equilibration and 
autocorrelations times. This is particularly of interest near a 
quantum phase transition, where equilibration and autocorre- 
lation times diverge with system size L. We consider both 
complete (perfect) sampling and incomplete (perfect) sam- 
pling schemes. In the former, the indices for all sites of the 
effective lattice are sampled. In the latter, only the indices of a 
subset of sites is sampled, while the indices of the rest of sites 
are contracted exactly, with an insignificant or minor increase 
of computational cost as far as the scaling 0(x 9 ) is concerned. 
Importantly, the statistical variance (due to sampling) of an 
expectation value obtained with incomplete sampling can de- 
crease dramatically with a proper chose of sampling basis, as 
illustrated in Fig. |9]with a drop of 10~ 7 in error. 



The paper is organized in sections as follows. First, in sec- 
tion |n] we briefly review the use of Monte Carlo sampling 
techniques to evaluate the expectation value of local opera- 
tors in context of tensor networks, and introduce the notions 
of complete and incomplete sampling. Then in sectionHITlwe 
explain how the proposals of Refs. 23ll24l can be adapted to 
the case of a unitary tensor network by sampling within the 
past causal cone of the local operator. In section [TV] we pro- 
pose a complete perfect sampling scheme for unitary tensor 
networks. Its performance is demonstrated for a uMPS with 
the quantum Ising chain at criticality. In section [V] we then 
present an incomplete perfect sampling scheme. We discuss 
computational costs in section [VI] The conclusions in Sec- 
tion IVIII and an Appendix analyzing the variance in different 
schemes close the paper. 



We emphasize that this paper is only concerned with the 
evaluation of local expectation values from a unitary tensor 
network. That is, here we assume that the unitary tensor net- 
work has already been optimized and focus on how to extract 
information from it. The optimization of unitary tensor net- 
works using variational Monte Carlo is discussed in Ref. H^. 



Let us start by introducing our notation and by reviewing 
some basic concepts. 



A. Exact contraction versus sampling 



Let £ be a lattice made of L sites, with vector space 
Vc = <gif—iV, where V is the d-dimensional vector space 
of one site. Let | fy) £ Vc denote the wave-function encoded 
in the tensor network and let Abe a local operator on Vc- An 
important task in tensor network algorithms is to compute the 
expectation value which can be expressed as 



(*iii*)=x;<*i8><8|ii*) ) 



(i) 



where |s) = \s{) ® \s2) <8> • ■ ■ ® |sl) denotes a product 
state of the L sites of the lattice, with Sj = 1, 2, • • • , d la- 
belling the elements of an orthonormal basis {|si)} on site i, 
i = 1, 2, • • • , L. Here, S is the set of all d L possible con- 
figurations s = (si, S2, ■ ■ ■ , Sl) of the system. The expecta- 
tion value of Eq. ([TJ can be obtained exactly by contracting 
the corresponding tensor network. However, a large compu- 
tational cost motivates the search for an alternative approach 
based on sampling. 

In preparation for an approximate evaluation of the expec- 
tation value (^lAl^), let us first introduce the probability 
Q{s) = | (s | | 2 of projecting state into the product state 
|s), and the estimator A(s) = (s\A\^)/(s\^>), and rewrite 
Eq. (fl~|i as 



<*|i|*)=^Q(s) J 4(s). 



(2) 



This expression emphasizes that (^\A\ ty) can be regarded as a 
probabilistic average of estimator A(s) according to the prob- 
abilities Q(s), where Q(s) > 0, J2 seS Q(s) = 1. 



Let us replace the sum over the set S of all |<S| 



d L 



figurations s with a sum over some subset S C S containing 
./V = |<S| configurations s, where N < d L , that is 



*|i|*)«l£Q(s)A(s), 



(3) 



sES 



where Z = X^ses Q( s ) i s a normalization factor. Eq. (0 
states that an approximate evaluation of (\f'|.A|\I>) is obtained 
by considering a probabilistic sum over TV configurations s. If 
the configurations in S have been randomly chosen from S 
according to the probability Q(s), then importance sampling 
allow us to replace the previous expression with 



(4) 
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FIG. 1: (Color online) Contraction of a tensor network, (a) Tensor 
network corresponding to the expectation value ( $ j A | $ ) , with a sum 
over (or exact contraction of) indices si, S2, • • • ,sa (exact contrac- 
tion). Contracting this tensor network has a cost that scales as 0(x P ) 
with the bond index \, for some power p. (b) Tensor networks corre- 
sponding to ($|s)(s|j4|4') for a given configuration s, correspond- 
ing to a single sample. The cost of contracting these two networks 
scales as 0(x Q ) with the bond index \, where power q is smaller 
than power p. (c) Tensor network corresponding to {\& j s°) (s° | ^4 IMf) 
for a given incomplete configuration s° = (si, S2, S3) (these three 
indices are being sampled), where in addition there is a sum over (or 
exact contraction of) indices 54, S5 and s@. The cost of contracting 
this tensor network scales as 0(x q ), with q' somewhere between q 
and p. 

Equation © estimates by means of N indepen- 

dent samples of a random variable (A(s), Q(s)). By construc- 
tion, the mean A of this random variable, 

A = ^2Q(b)A(s), (5) 
ses 

is given by the expectation value (5 , |^4|^E') of operator A, see 
Eq. (f2]i. Notice that, in addition, its variance a\, defined by 

a\ = £Q(sMs)-i| 2 (6) 

S 

= ^Q(s)|^(s)| 2 -|A| 2 , (7) 

S 

also equals the variance o 1 ^ of operator A, 

a\ = (*|(|i-(*|l|*)| 2 )|*) (8) 

= (*|(|i| 2 )|*)-|(*|i|*}| 2 , (9) 

that is a A = crl, see Appendix. It follows that the error 
€a(N) in the approximation of Eq. ©, as measured by the 



standard deviation <ja/VN of N independent samples, scales 
with N as 

[°\ 

e A (N)^^. (10) 

Let us analyze in which sense the above Monte Carlo sam- 
pling strategy could be of interest. The cost (i.e. computa- 
tional time) of an exact contraction, Eq. (Q]), scales as 0(x p ) 
with the bond dimension x- On the other hand, notice that for 
each specific configuration s, the contribution ( 1 J r |s)(s|A|4 r ) 
to (\J , |A|\D') consists of two tensor networks, namely one for 
(vE^s) and another for (s|A|^), whose contraction can be ac- 
complished with a cost 0(x q ), for some q < p, see Fig. Q] 
[This is also the cost of computing Q(s) and A(s) in Eq. @]. 
If the number of samples required to obtain an acceptably 
small error €a(N) is N rj 0(x q ), the use of sampling in- 
curs a computational cost of 0(x 9+q ) instead of 0(x p )- We 
conclude that if q + q' < p, then (for large x) the sampling 
strategy will have a lower computational cost than the exact 
contraction. 



B. Combining exact contraction with sampling: 
Incomplete sampling 

More generally, one can consider a hybrid strategy which 
combines exact contraction and sampling. This is accom- 
plished by sampling over only a subset of the L indices cor- 
responding to the L sites of lattice C, while performing an 
exact contraction on the remaining sites. For instance, Fig. 
[He) considers a lattice C made of L = 6 sites where the first 
three sites are being sampled, with configuration (si, S2, S3), 
whereas the remaining three of sites are being addressed with 
an exact contraction. 

If we denote by s° <G S° a configuration of the V indices 
to be sampled (V < L), then Eq. Q]is replaced with 

(*iii*>= (*is°)(sw>, (id 

We can again rewrite Eq. (fTTb as a probabilistic sum of an 
estimator A°(s°) = (^|s <> )(s <> |i|*)/|(1'|s )| 2 according to 
probabilities Q(s°) = K^s )! 2 , 

= Q(sW)- (12) 

Similarly, we could generalize Eqs. [3]|4]and apply importance 
sampling. We note that in this case the variance <j\o, defined 
by 

a% = £Q(s*)K(s e )-l| 2 (13) 
= YQ( S °)\ A ( S<> )\ 2 -\A\ 2 , (14) 

might be smaller than the variance a 2 , of operator A (Eq. 
[5), since a single incomplete sample s corresponds to many 
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complete samples s. [For instance, in the example of Fig. 
UXc), the incomplete sample s° = (s l7 s 2 , S3) corresponds to 
all complete samples s = (sx, S2, S3, S4, S5, Se) that coincide 
with s° in the first three sites.] In other words, the statistical 
error might be reduced. This should not come as a surprise. 
After all, in the extreme case where no sampling at all is per- 
formed (L° = 0) but all indices are exactly contracted, there 
is no statistical error left. 



C. Markov chain Monte Carlo 

In Refs. I2I24I the random configurations s were generated 
by means of a Markov chain process based on local updates. 
Given a stored configuration s, let us denote s' ; a configuration 
obtained from s by replacing in site i the value Sj with s[. 
Then, visiting the sites sequentially, i = 1, 2, • • ■ , L, in what 
is known as a sweep, a change on site i is introduced according 
to the Metropolis probability 

Pctage = min[ ^y^- , 1]. (15) 

In this way, after one sweep a new configuration s' is obtained 
from s, and by iteration a sequence of configurations 

s -> s' -> s" -> • ■ ■ (16) 

is produced. However, these configurations will in general be 
correlated. The number r of sweeps required between config- 
urations s and s' in order for them to be essentially indepen- 
dent to be independent is known as the autocorrelation time. 
Sweeping r times between samples is necessary in order for 
the error ca(N) to scale as in Eq. ( [Tot , since that expression 
for the error assumed the samples to be independent. (If only 
a single sweep mediates the samples, the statistical error in 
Eq. ( [Tol l increases by a factor which scales as r 1 / 2 due to au- 
tocorrelations). In addition, the first sample s will be obtained 
after applying r' sweeps to some random initial configura- 
tion. The equilibration time r' is necessary in order to guaran- 
tee that the first sample is picked-up according to the correct 
probability distribution. The autocorrelation time r and the 
equilibration time t' are known to diverge with systems size 
L for critical systems. 

Large equilibration and autocorrelation times, e.g. near or 
at a critical point, increase the cost of simulations. This in- 
crease can be prevented if somehow independent configura- 
tions s can be directly generated according to probabilities 
Q(s). In section [TV] we show how this is possible for a spe- 
cific class of tensor networks, namely unitary tensor networks, 
which are introduced next. 



III. SAMPLING OF UNITARY TENSOR NETWORKS 

Let us specialize to the particular case of unitary tensor net- 
works, namely tensor networks that are based on a unitary 
quantum circuit. Examples include the MERA and unitary 




FIG. 2: (Color online) Sampling in a unitary matrix product state 
(uMPS). (a) uMPS for a state |*) of lattice £. Notice the (fictitious) 
time direction, which provides each tensor with a sense of which 
indices are incoming and which are outgoing, (b) The past causal 
cone C of a local operator A acting on a single site of £ (denoted by 
a discontinuous circle) defines an effective lattice £ c , which is found 
in state |^/ c ). Notice that the effective lattice £ c is made of two types 
of sites, namely sites already present in the original lattice £ and one 
site not present in £, with d-dimensional and ^-dimensional vector 
spaces, respectively, (c) Tensor networks representing (\E , |A|\E r ) and 
("3/ c I j4| \[/ c ) . The inset shows unitarity reductions [Eq. U7H used to 
transform into (* C |A|* C ). 



versions of MPS (with open boundary conditions) and TTN, 
which we will refer as uMPS and uTTN 32 . 

Unitary tensor networks are special in that each tensor u 
is constrained to be unitary /isometric. Figs. [2] and [3] exem- 
plify the discussion for uMPS and uTTN respectively. Specif- 
ically, we first note that in one such tensor network there is a 
well-defined direction of time throughout, see e.g. Figs. |2|a) 
and [3^ a). Each index of a tensor u is either an incoming in- 
dex (if time flows towards the tensor) or an outgoing index 
(if time flows away from the tensor). The constraint on u can 
be expressed in the following way. Let us group all incoming 
indices of u into a composite incoming index a and all out- 
going indices of u into a composite outgoing index f3, so that 
tensor u becomes a matrix up a . Then the unitary /isometric 
constraint on u reads 

^{u^apUpa' = 6 aa >. (17) 



A direct implication of this property is that the tensor net- 
work corresponding to the expectation value (\f r |A|\I/) can be 
replaced with a simplified tensor network where the pairs of 
tensors (it, xv) outside the so-called past causal cone C of A 
have been removed, see Figs. |2|c) and[3]c). This new tensor 
network can be interpreted to represent the expectation value 
(* c |^4|^' c ) of the local operator A on a state \^ c ) e V £ c 
of an effective lattice C c defined by the causal cone C of the 
operator A, see Figs. |2]b) and Ob), where by construction 
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FIG. 3: (Color online) Sampling in a unitary tree tensor network 
(uTTN). (a) uTTN for a state |*> of lattice C. (b) Effective lattice 
C c . (c) Tensor networks for (*|0|*> and (* c |„4|* c ). The inset 
shows a reduction due to the unitary constrain of tensors in the uTTN. 



= (* c |i|* c ). The effective lattice C c is made of 
L c sites that can be of two types: those already contained in 
the original lattice C, which are described by a rf-dimensional 
vector space, and those which did not belong to C, which 
are described by a %-dimensional vector space. We use r = 

, ra, ■ ■ ■ , r L c ) to denote a configuration of the effective lat- 
tice C c , and |r) = |ri) <E> \r2) <& ■ • • ® Vl c ) tne corresponding 
product vector, where for some sites = 1, 2, ■ • • , d and for 
some others = 1, 2, • • • , X- We denote 1Z the set of all 
configurations r. 

The exact contraction of the tensor network corresponding 
to (^f c |^4| x & c ), may still be very expensive and again we might 
be interested in exploring the use of sampling to lower the 
computational cost. For that purpose, we repeat the discussion 
in section HI1 First we write the expectation value (\E , |^4|^E') as 

(*|A|*H^(* c |r)(r|i|vI/ c ), (18) 

see Fig. g]for uMPS and uTTN. Then we rewrite Eq. (Qj]) in 
terms of the estimator A c (r) = (r|-A|\l/ c )/(r|\I rC ) and proba- 
bilities P(r) = |(r|* c )| 2 , 

(tf |A|#> = P(r)A c (r). (19) 

We can again limit the sum over configurations r to a subset 
1Z containing just N configurations which, when chosen from 
1Z randomly according to the probabilities P(r), results in 

<*i4*> 4^ /(r) - (20) 

The error in the approximation scales with N as in Eq. ( [Tol l. 




FIG. 4: (Color online) Graphical representation of (* C |A|* C ) = 
^ rS - R (* C |r)(r|A|* c ). In (a), the original state |*) was repre- 
sented with an uMPS, see Fig. [2] In (b), the original state |$) was 
represented with an uTTN, see Fig. [3] However, in both cases the 
state \^ c ) is represented by an uMPS that runs through the causal 
cone. 



IV. PERFECT SAMPLING 

In this section we describe how to randomly draw configu- 
rations r according to probability P(r) in a unitary tensor net- 
work. We refer to this scheme as perfect sampling because, in 
contrast with Markov chain Monte Carlo, the present scheme 
produces perfectly uncorrelated samples. We will also refer 
to this scheme as complete perfect sampling, to distinguish it 
from the incomplete perfect sampling scheme discussed in the 
next section, where sampling is performed only on a subset of 
sites. 



A. Algorithm 

Recall that as a quantum circuit, the tensor network is 
equipped with a notion of (fictitious) time. From now on we 
assume that the labeling of the sites in the effective lattice C c 
has been chosen so as to progress forward with respect to this 
notion of time. Thus, site 1 corresponds to the earliest time, 
site 2 corresponds to a later time, and so on, until site L c cor- 
responds to the latest time (when two sites correspond to the 
same time, e.g. sites 4 and 5 in Fig. [4] (a), we order them 
arbitrarily). 

Our perfect sampling algorithm consists of sequentially 
computing a series of conditional single-site density matri- 
ces {pi, P2(fi), • ■ ■ } an d conditional single-site probabilities 
{P(ri), P(r2\ri), ■ ■ ■}■ First we compute the reduced density 
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FIG. 5: (Color online) Perfect sampling with a uMPS. The figure 
shows a sequence of the tensor networks corresponding (up to a pro- 
portionality constant) to pi, P(n), p2{r2), P(f*2|fi), and so on, 
see Eqs. d2 14428t . Importantly, all these tensor networks can be con- 
tracted with a cost that scales as 0(x 2 ) with the bond dimension x, 
and are therefore computational less expensive than an exact contrac- 
tion, which has cost 0(x 3 )- 



matrix p\ for site 1 exactly, i.e. without sampling, 

Pi =tr 2 ... LC {|* C )(* C |} (21) 
from which we can compute the probabilities 

P(n) = (nMn). (22) 

We can then randomly choose a value for r\ according to 
probability P(r\), and compute (exactly) the conditional re- 
duced density matrix p 2 (ri) for site 2, which is obtained from 
the state (ri\<k c ) of sites 2 to L c ', 

fts(ri) = — Urtr 3 ... £ c {(ri|* c )<* c |n)} . (23) 

P(n) 

Again, we can use the reduced density matrix to compute the 
conditional probabilities 



P(r 2 \n) = (r 2 \p 2 (r 1 )\r 2 ), 



(24) 



and we can therefore randomly select a value of r 2 according 
to probabilities P(r 2 |n ). Let us notice at this point that so far 
we have randomly chosen values for n and r 2 according to 
the probability 



P(n,r 2 ) = P(ri)P(r a |n) = ||<ri,r 3 |* 



C\||2 



(25) 



We can now iterate the above process, that is, compute the 
conditional density matrix 



P3(ri,r 2 ) 



1 



P(ri,r 2 ) 



tr 4 ...£c {(ri,r2|* c }(* c |r 1 ,r 2 )} 



and the conditional probabilities 

P(f3ki,r 2 ) = (r 3 |p 3 (ri,r 2 )|r 3 ), (27) 

and so on for the rest of sites in the effective lattice C c . In this 
way, and since 

P(r)=P(r 1 )P(r 2 \r 1 )---P(r L c\r 1 ,r 2 ,--- ,r L c-i), (28) 

we end up indeed randomly choosing a configuration r = 
, i~2, ■ ■ ■ , tl c ) w i m probability given precisely by P(r) = 
l(r|* C )| 2 . 

Fig. [5] illustrates the sequence of computations in the case 
of a one-site operator A specifically for a uMPS, assuming as 
in Figs. |2] and 2[a) that the operator A is supported on the 
fourth site of the original chain. This algorithm is similar to 
one used for thermal state sampling with MPS 25 described in 
Ref.HH Analogous computations for a uTTN are very similar, 
since the causal cone of a single-site operator A is described 
also by a uMPS, see Fig. Ob). For the case of a MERA, more 
details on the implementation of Eqs. ( T2TI - |28l ) can be found in 
Ref.HI 

A key point is that, for unitary tensor networks such as 
uMPS, uTTN, and MERA, the computational cost of gener- 
ating the above sequence of density matrices and probabilities 
often does not exceed (to leading order in \ an d effective size 
L c ) the cost of a single sweep in Markov chain Monte Carle 33 . 

B. Benchmark 

To illustrate the performance of the perfect sampling 
scheme and compare it to Markov chain Monte Carlo, we have 
considered a duly optimized uMPS for the ground state 
of the quantum Ising model with critical transverse magnetic 
field, 



Ising 



E^ 2 



(29) 



(26) 



on an open chain of L spins. 35 The two sampling schemes are 
then used in order to compute the expectation value of local 
operators. 

Fig. |6ja) and (b) show a history of 150 configurations of 
a chain of L = 50 spins obtained with perfect sampling and 
Markov chain Monte Carlo, respectively. The existence of 
correlations in the second case is manifest. 

Fig. |6jc) and (d) show the error in the expectation value 
(*|o-f g |*) and (i|a| 5 |*) for the local operators a z and a x 
on site 25, as a function of the number of samples N. In both 
cases, the effect of autocorrelations in Markov chain Monte 
Carlo results in an error larger than the error obtained with 
perfect sampling, which is given by Eq. ( [Tot . The ratio be- 
tween statistical errors, as given in terms of the autocorrelation 
time r by \J2t + 1, is seen to depend on the choice of local 
operator - this autocorrelation time is larger for (^lafsl^) 
than for (^lafsj*)- 

Finally, Fig. |6](e) and (f) explore the autocorrelation time r 
for a z as a function of the size L of the spin chain. In particu- 
lar, Fig. |6](f) reveals that r grows linearly in L. This means 36 
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FIG. 6: (Color online) Sampling of the ground state of the critical 
transverse Ising model in the z basis. Comparison between configu- 
rations obtained using (a) the presented perfect sampling scheme and 
(b) a Markov chain scheme (single sweep) on 50 sites. Blue sites rep- 
resent spin up and yellow for spin down. The correlations between 
configurations obtained using a Markov chain scheme are evidenced 
by the appearance of domains of well defined color that extend ver- 
tically. In (c) we have calculated the expected statistical error on 
the estimate of (<r z ) for the perfect sampling (blue line) and Markov 
chain sampling (blue dots). While with perfect sampling the error 
decreases with the usual N^ 1 ^ 2 factor, correlations between subse- 
quent samples increase the error on the estimate in the Markov chain 
scheme. In (d) we plot the same for (a x ) by projecting all the spins 
into the x basis. In this case the Markov scheme used utilizes a 2-site 
update so as to be compatible with the wave-function symmetry^. In 
(e) we present the correlations on the centre site (in the z basis) after 
j Markov chain sweeps using 10 6 samples for 50 sites (blue dots) 
and 250 sites (black crosses). In the perfect sampling scheme (blue 
line), there are no correlations between configurations. In (f) we plot 
the estimated autocorrelation time for different system sizes. 



=E 





FIG. 7: (Color online) Graphical representation of (* C |A|* C ) = 
^ r o e - R o{* C |r)(r°|4|* c ) for a uMPS, to be compared with Fig. 
Ufa). Notice that sampling does not affect two of the indices, over 
which an exact contraction is still performed. 



Pi 



\ / v / \ 
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FIG. 8: (Color online) Incomplete perfect sampling with a uMPS. 
The figure shows a complete sequence of the tensor networks cor- 
responding (up to a proportionality constant) to pi, P(n), £2(^2), 
P(f2\ri), pz{r\,r2) and P(ri,r<2 : rz) necessary in order to gen- 
erate a configuration r° = {ri,r2,r^) with probability P(r°) = 
I (*& c |r°) | 2 . Notice that the cost still scales as 0(x 2 ), as in the com- 
plete (perfect) sampling scheme. 



V. INCOMPLETE PERFECT SAMPLING 



that in order to achieve a fixed accuracy in (^>\<j z l j 2 \^), the 
number of samples N with Markov chain Monte Carlo has to 
grow linearly in L, whereas a constant number of samples is 
enough with perfect sampling. 

It is important to stress, however, that the Markov chain 
Monte Carlo update scheme discussed here, based on single 
spin updates, is used as a reference only - more sophisticated 
Markov chain Monte Carlo schemes, based e.g. on global spin 
updates, could lead to smaller autocorrelation times. 



So far we have considered perfect sampling over the whole 
causal cone, that is, over the indices associated to all the sites 
of the effective lattice C c . However, it is also possible to use 
an incomplte perfect sampling scheme, which combines per- 
fect sampling over most of the sites of C and an exact con- 
traction over a small set of sites, without altering the scaling 
0(x q ) of the cost of a single sample. Because we are sam- 
pling over fewer indices, we can expect a decrease in the sta- 
tistical error with little change in the cost. In some cases the 
reduction in statistical uncertainty can be dramatic. 
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A. Incomplete perfect sampling scheme 

The incomplete perfect sampling scheme is illustrated in 
Fig. [7]fbr a uMPS. The first step is to rewrite the expectation 
value (* c |i|1' c ) = as 

= (* C |r°)(r°|i|* c ), (30) 

where TV is the set of incomplete configurations r° = 
{fi, T2, ■ ■ ■ ,i"lo), where L° is the number of sites over which 
sampling takes place, with L° < L c . For the case of the 
uMPS illustrated in Fig. |7l one can perform an exact contrac- 
tion on two sites of C c , namely the site on which the local 
operator A is supported and the effective, x-dimensional site 
corresponding to the bond index of the uMPS. Notice that now 
the term (\I/ c |r <> )(r <> |.A|\E rC ) does not factorize into two terms, 
since (r°|* c ) and (r°|i|* c ) are no longer complex numbers 
but dx-dimensional vectors. 

We can still rewrite Eq. d30b as a probabilitistic sum of an 
estimator A°(r°) = (* c |r )(r |i|* c )/|(* c |r°)| 2 accord- 
ing to probabilities P(r°) = |(* c |r°)| 2 , 

(*|A|*)= Yl i^W), (31) 

limiting the sum over configurations r° to a subset TV con- 
taining just N configurations, and use (perfect) importance 
sampling to obtain the estimate 

An important difference between the incomplete perfect sam- 
pling scheme and the comnplete perfect sampling scheme of 
Eqs. dT8l - [20b is that the estimator A , whose mean is A° = 
(\I r |.A|\I>) as indicated in Eq. PTT ), has a variance a 2 AO , 

= E P{**)\A*(t*)-A*\ 2 (33) 
= E P{^)\A^)\ 2 -\A^, (34) 

r 5 GR° 

that is no longer necessarily equal to the variance a 2 ^ of 
Eq. (O, but is instead upper bounded by it, a\<, < cr|, see 
the Appendix. In other words, the error cao(N) in the ap- 
proximation of Eq. d32| l, given by 

eA«(N)*^5f, (35) 

can be smaller than the error €a (N) of a complete sampling 
scheme. 



B. Algorithm 

We have implemented the incomplete perefect sampling 
scheme in conjunction with the complete perfect sampling 



scheme described in section [IV] We notice, however, that 
incomplete sampling can also be incorporated into Markov 
chain Monte Carlo. 

As in section IIV1 we proceed by constructing a se- 
quence of conditional single-site reduced density ma- 
trices {pi, P2{ri), ■ ■ ■} and conditional probabilities 
{P(r±), P(r2\ri), ■ ■ ■ }. However, in this occasion the 
sequence concludes at site L°, after which we can already 
evaluate the estimator A°(r <> ). This is illustrated for the case 
of a uMPS in Fig. [8] which is to be compared with Fig. [5] 



C. Benchmark 

As in section [TVl we use sampling to compute the expec- 
tation value of local observables from a uMPS with \ — 30 
that has been previously optimized to approximate the ground 
state of the quantum Ising chain at criticality, Eq. d29l ). The 
exact structure that we sample can bee seen in Fig. [7] Fig- 
ure|9]shows the sampling error, as a function of the number of 
samples N, in the computation of (^'lo'lsl^) an d W^isW 
in a chain of L = 50 spins. The error is seen to depend on two 
factors. On the one hand, it depends on which operator (a z 
or a x ) is being measured, as it did in section|IV] In addition, 
now it also drastically depends on which product basis {|r°)} 
is used. In particular, we see that a very substantial reduction 
of sampling error, of seven orders of magnitude, is obtained 
by measuring on the x basis while computing (^lo^l^). It 
should be noted that the two-site Markov chain update scheme 
used for the 2>basis calculations,^ although appears competi- 
tive, is more computationally demanding than the perfect sam- 
pling scheme and runs approximately 2-3 times slower. 



VI. COMPUTATIONAL COSTS 

For completeness, we include a brief summary of the com- 
putational costs incurred in extracting, from a given unitary 
tensor network, the expectation value of a local operator by 
using (i) exact contraction, (ii) Markov chain Monte Carlo 
and (iii) a perfect sampling scheme. For simplicity, we con- 
sider only one-site local operators. The scaling of the costs 
in the bond dimension \ is presented in Table J] We empha- 
size that in the sampling schemes, we only consider the cost 
of obtaining one sample. A fair comparison of costs with an 
exact contraction should also take into account the number of 
samples required in order to approximate the exact result with 
some pre-agreed accuracy. 

The table shows that for both a uMPS and the MERA, the 
cost of Markov chain Monte Carlo and perfect sampling scale 
with the same power. Instead, for the uTTN, the of Markov 
chain Monte Carlo is one power smaller than that of perfect 
sampling. [The same would happen with uMPS if the local di- 
mension of each site was also xl- More significant speed-ups 
can be seen with the MERA, both for the computation of two- 
point correlators, and in systems in two dimensions (not in the 
Table), where sampling techniques to increase computational 
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Number of samples 



FIG. 9: (Color online) Sampling errors with the incomplete perfect 
sampling scheme for a 50 site critical Ising chain, using both perfect 
sampling (continuous lines) and Markov chain Monte Carlo sam- 
pling (dots), (a) Sampling errors in the computation of (vE'lcrfsl^). 
With perfect sampling, errors in the incomplete perfect sampling 
scheme are upper-bounded by the errors in a complete sampling 
scheme, as proven in the Appendix. Interestingly, for estimates of 
(a z ) the incomplete perfect sampling scheme obtains an error 10 -7 
times smaller by measuring in the x basis on sites 1, 2, • • • ,L°. (b) 
Sampling errors in the computation of ] <5"2 5 1 "3/) . Again, the errors 
with incomplete perfect sampling are smaller than those with com- 
plete perfect sampling, and depend on the choice of product basis. 



efficiency are required most. The authors present an in-depth 
analysis of perfect sampling with the MERA in Ref. Ugl 

A further remark is in order. The above analysis assumes 
that a tensor network has been provided in a unitary circuit 
form. In particular, the costs in Table Q] do not include op- 
erations such as converting a non-unitary version of the ten- 
sor network into its unitary form (typically through the QR- 
decomposition). In particular, the cost of QR-decompositions 
required to turn an MPS into a uMPS scales as 0(\ 3 ) - that 
is, the same scaling as an exact contraction. What is then the 
practical interest in a perfect sampling scheme for a uMPS? 
On the one hand, the uMPS might conceivably have been gen- 
erated through some procedure (e.g. along the lines of the 
algebraic Bethe Ansatz MPS constructions described in Ref. 
1301) . with a cost 0(x 2 ) (notice that a uMPS tensor only con- 
tains 0(x 2 ) coefficients). In this case, the perfect sampling 
scheme would allow for a very efficient, approximate evalu- 
ation of expectation values without increasing this cost. On 



TABLE I: The leading-order costs of contracting unitary tensor net- 
works with and without sampling techniques, with the goal of esti- 
mating the expectation value of a one-site operator. For the MERA 
we have also included the cost calculating arbitrary (long-range) two- 
point correlators.— 



Tensor 


Exact 


Markov- 


Perfect 


network 


contraction 


chain MC 


sampling 


uMPS (open BC) 


o(x :i ) 


o(x 2 ) 


o(x 2 ) 


uTTN (binary) 


o(x 4 ) 


o(x 2 ) 


o(x 3 ) 


MERA (ID binary) 


o(x 9 ) 


o(x 5 ) 


o(x') 


— >• 2-point correlators 


o(x 12 ) 


o(x 7 ) 


o(x s ) 



the other hand, although we have focused our analysis on the 
evaluation of local expectation values, more complex tasks in- 
volving a uMPS, such as the computation of entanglement en- 
tropy, can exploit the perfect sampling schemes presented in 
this paper at a cost significantly lower than that of an exact 
contraction (see e.g. Ref. [3ll) . 



VII. CONCLUSIONS 

We have explained how to perform Monte Carlo sampling 
on unitary tensor networks such as the MERA, uMPS and 
uTTN. In order to compute the expectation value (^| of 
a local operator A, sampling is performed on the past causal 
cone C of operator A. In addition, by exploiting the uni- 
tary character of the tensors, it is possible to directly sample 
configurations r of the causal cone according to their weight 
in the wave-function, resulting in uncorrelated samples and 
thus avoiding the equilibration and autocorrelation times of 
Markov chain Monte Carlo schemes. This last property makes 
the perfect sampling scheme particularly interesting to study 
critical systems. 

In principle, one can also proceed as in Eqs. ( f2TI - [28T > for 
non-unitary tensor networks, e.g. PEPS, and obtain perfect 
sampling. However, in non-unitary tensor networks the cost 
of computing e.g. p\ is already the same as that of computing 
the expectation value 1 ^4 1 M/) without sampling. Therefore 
perfect sampling in non-unitary tensor networks seems to be 
of very limited interest. 

Here we have only considered sampling in the context of 
computing expectation values. However, the same approach 
can also be applied in order to optimize the variational ansatz, 
as discussed in full detail in Ref. |29j for the MERA. 

The authors thank Glen Evenbly for useful discussions. 
Support from the Australian Research Council (FF0668731, 
DP0878830, DP1092513), the visitor programme at Perime- 
ter Institute, NSERC and FQRNT is acknowledged. 
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Appendix A: Variance with complete and incomplete sampling 



2. Mean and variance with incomplete sampling 



Given a vector | ^) € Yc and a local operator A, the expec- 
tation value of A is given by (4 , |^4|^') and its variance is 



o\ = (*|(|i-(M/|A|vI/)| 2 )|*) 



*| l\A\* |tt)-|<*H*>l' 



(Al) 
(A2) 



1. Mean and variance with complete sampling 

Consider the complex random variable (A(s), P(s)), where 
A(s) is the estimator 

(j^Ksji^ _ (gjjjtt) .... 

A(s) - (*|s>(si*) - i^*r (A3) 

and Q(s) is the probability 

Q(s) = (*|s)(s|tt). (A4) 

Here {|s)} denotes an orthonormal basis in the vector space 
Yc- Notice that |s) (s| is a resolution of the identity in Yc 
and therefore J2 S Q( s ) = = l - 

The mean A is given by the expectation value 



x;(*is>(s|ii*) = (*iii*). 



(A5) 



In turn, its variance a\, 

o\ = ^Q(s)|A(s)-A| 2 (A6) 

S 

= ^Q(s)|A(s)| 2 -|A| 2 , (A7) 



equals the variance a 2 ^ of operator A, as can be seen from 



= Y,(M^\s)(s\A\^) = ^\(\A\A |* 



(A8) 



Consider now a new complex random variable 
(A(s), Q(s)), where A(s) is the estimator 



A(s) 



(*|tt(b)A|*) 

(*k(s)l*> 



(A9) 



and Q(s) is the probability 

Q(s) = (*|tt(s)|*>. (A10) 
Here {7r(s)} denotes a complete set of projectors on the vector 
space Yc, that is 7r(s) 2 = 7r(s), and 7r(s) is a resolution 
of the identity in Y c , so that J2 S Q( s ) = = 1- Notice 

that if all the projectors 7r(s) have rank one, then we recover 
the situation analyzed in the previous subsection. Notice also 
that this more general setting includes the case addressed in 
Sect. [V]in the context of incomplete sampling. 

The mean A is again given by the expectation value 

(mm, 



A = Vq W a(.) = EW»W?t^ 



(All) 



However, this time the variance u\ is only upper bounded by 
the variance er^ of operator A. This follows from, 

s 

^ K 1 yn 1 (*|tt(s)|*) (*K(s)|*) 

_ (*lit 7r ( s )l^)($l 7r ( s )il^f) 

< E(*I^ t7T (s)^I*) = (*l (l^l 2 ) I*)- ( A12 ) 



Here, the inequality follows from (x\y)(y\x) < (x\x)(y\y) 
with the identifications \x) = k(s)A\^) and \y) = 7r(s)|\l/). 



1 S.R. White, Phys. Rev. Lett. 69, 2863 (1992). 

2 S.R. White, Phys. Rev. B, 48, 10345 (1993). 

3 M. Fannes, B. Nachtergaele, and R. F. Werner, Commun. Math. 
Phys. 144, 443 (1992). 

4 S. Ostlund and S. Rommer, Phys. Rev. Lett. 75, 3537 (1995). 

5 G. Vidal, Phys. Rev. Lett., 91, 147902 (2003) 

6 D. Perez-Garcia, F. Verstraete, M. M.Wolf, and J. I. Cirac, Quant. 
Inf. Comput. 7, 401 (2007). 



7 U. Schollwoeck, Rev. Mod. Phys., 77, 259 (2005). 

8 U. Schollwoeck, Ann. of Phys. 326, 96 (201 1). 

9 G. Vidal, Phys. Rev. Lett., 99, 220405 (2007); G. Vidal, Phys. 
Rev. Lett., 101, 110501 (2008). 

10 G. Vidal, in Understanding Quantum Phase Transitions, edited 
by L. D. Carr (Taylor & Francis, Boca Raton, 2010), 
|arXiv:0912.165TV 2. 

11 F. Verstraete, and J. I. Cirac. larXiv:cond-m at/0407066 vl (2004). 



11 



12 G. Sierra and M.A. Martin-Delgado, |arXiv:con d-mat/981 iT70V 3 
(1998). 

13 T. Nishino and K. Okunishi, J. Phys. Soc. Jpn., 67, 3066, 1998. 

14 V. Murg, F. Verstraete, and J. I. Cirac, Phys. Rev. A, 75, 033605 

(2007) . 

J. Jordan, R. Orus, G. Vidal, F. Verstraete, and J. I. Cirac, Phys. 
Rev. Lett., 101, 250602 (2008). 

16 Z.-C. Gu, M. Levin, and X.-G. Wen, Phys. Rev. B, 78, 205116 

(2008) . 

17 H. C. Jiang, Z. Y. Weng, and T. Xiang, Phys. Rev. Lett., 101, 
090603 (2008). 

18 G. Evenbly and G. Vidal, Phys. Rev. B, 81, 235102 (2010). 

19 G. Evenbly and G. Vidal, New J. Phys., 12, 025007 (2010). 

20 L. Cincio, J. Dziarmaga, and M. M. Rams Phys. Rev. Lett., 100, 
240603 (2008). 

21 G. Evenbly and G. Vidal, Phys. Rev. Lett., 102, 180406 (2009). 

22 Exploitation of space symmetries (e.g. translation invariance and 
scale invariance) can significantly reduce the computational cost 
of simulations from O(L) to 0(log L) or even to a constant (in- 
dependent of L), allowing to reach the thermodynamic limit. 

23 N. Schuch, M.M. Wolf, F. Verstraete, J.I. Cirac, Phys. Rev. Lett. 
100, 040501 (2008) 



31 



A. W. Sandvik, G. Vidal, Phys. Rev. Lett. 99, 220602 (2007). 



24 

25 S. R. White, Phys. Rev. Lett. 102, 190601 (2009). 

26 Y. Y. Shi, L.-M. Duan and G. Vidal, Phys. Rev. A, 74, 022320 
(2006). 

27 L. Tagliacozzo, G. Evenbly, and G. Vidal, Phys. Rev. B 80, 
235127 (2009). 

28 E. M. Stoudenmire and S. R. White, New J. Phys. 12, 055026 
(2010). 

29 A. J. Ferris, G. Vidal, e-print |arXiv:1201.3975l (2012). 

30 H. Katsura, I. Maruyama, J. Phys. A: Math. Theor. 43, 175003 



(2010). V. Murg, V. E. Korepin, F. Verstraete, arXiv:1201.5636 
(2012) and larXiv: 1 201 .5627 1 (20 1 2) . 
L. Cincio, G. Vidal, in preparation. 

32 From an MPS (TTN) for a state with given bond dimension 
X, one can always use the gauge freedom in these tensor networks 
to obtain a unitary MPS (respectively TTN) for the same state 

>]/) and with the same bond dimension \> by writing the tensor 
network in its canonical form^. 

33 We have found some tensor networks for which perfect sam- 
plin g co sts one more power of \ th an a Markov chain sweep, see 
Ref. [23. Further, for local operators supported on two or more 
sites, perfect sampling may incur a slightly larger cost than that of 
a single sweep of Markov chain Monte Carlo. For example, this 
occurs with a TTN, where perfect sampling of a two-site operator 
costs x 4 instead of x 3 , but not with MPS or MERA. 

34 The transverse Ising model contains a Z2 symmetry as the oper- 
ator \\ i Xi commutes with the Hamiltonian. For a wave-function 
chosen from one of the ±1 sectors of this operator, after making a 
projective measurement of all the spins in the 2>basis, one always 
gets an even (odd) number number of spin downs (actually, lefts). 
The overlap after flipping a single spin of such a configuration is 
always zero, and thus a two-site Markov chain update scheme that 
can preserve parity is required. 

35 Our Hamiltonian is actually defined in terms of bond operators, 
leading to effectively half the magnetic field at the ends of the 
chain. This change does not affect any of the critical properties of 
the system. 

36 We assume that the variance (*|(ol /2 ) 2 |>I') - ((*|<5-£ /2 |*)) 2 , 
which is upper bounded by 1, is essentially constant as a function 
of the system size L. 



